In computer programming, scope is an enclosing context in which values and expressions are associated. Various programming languages have various types of scopes. The type of scope determines what kind of entities it can contain and how it affects them; in other words, its semantics. Typically, scope is used to define the extent of information hiding, that is, the visibility or accessibility of variables from different parts of the program. Scopes can contain declarations or definitions of identifiers, contain statements and expressions, and nest inside or be nested by other scopes.
A namespace is a scope that uses the enclosing nature of the scope to group logically related identifiers under a single identifier. Thus, scopes can affect the name resolution for their contents.
Variables are associated with scopes. Different scoping rules affect how local variables are bound. This has different consequences depending on whether the language has static (lexical) or dynamic scoping.
Lexical scoping was used for ALGOL and has been picked up in most other languages since then. Static (lexical) scoping was also introduced in LISP 1.5 (via the Funarg device developed by Steve Russell, working under John McCarthy). The original Lisp interpreter (1960) and most early Lisps used dynamic scoping, but descendants of dynamically scoped languages often adopt static scoping; Common Lisp has both dynamic and static scoping while Scheme uses static scoping exclusively. Perl is another language with dynamic scoping that added static scoping afterwards. Languages like Pascal and C have always had lexical scoping, since they are both influenced by the ideas that went into ALGOL 60 (although C did not include lexically nested functions).
The following example shows various scopes declared in the language C#:
namespace N { // namespace scope, merely groups identifiers
    class C { // class scope, defines/declares member variables and functions
        void f(bool b) { // outermost block (function) scope, contains executable statements
            if (b) { // inner block scope for conditionally executed statements
                // (Note, both block scopes are unnamed.)
                ...
            }
        }
    }
}
One of the basic reasons for scoping is to keep variables in different parts of the program distinct from one another. Since there are only a small number of short variable names, and programmers share habits about the naming of variables (e.g., i for an array index), in any program of moderate size the same variable name will be used in multiple different scopes. The question of how to match various variable occurrences to the appropriate binding sites is generally answered in one of two ways: lexical scoping and dynamic scoping.
With lexical scope, a name always refers to its (more or less) local lexical environment. This is a property of the program text and is made independent of the runtime call stack by the language implementation. Because this matching only requires analysis of the static program text, this type of scoping is also called static scoping. Lexical scoping is standard in all ALGOL-based languages such as Pascal, Modula-2 and Ada, as well as in modern functional languages such as ML and Haskell. It is also used in the C language and its syntactic and semantic relatives, although with different kinds of limitations. Static scoping allows the programmer to reason about object references such as parameters, variables, constants, types, and functions as simple name substitutions. This makes it much easier to write modular code and reason about it, since the local naming structure can be understood in isolation. In contrast, dynamic scope forces the programmer to anticipate all possible dynamic contexts in which the module's code may be invoked.
For example, consider the following program fragment in Pascal:
program A;
var I: integer;
    K: char;

    procedure B;
    var K: real;
        L: integer;

        procedure C;
        var M: real;
        begin
            (*scope A+B+C*)
        end;

    begin
        (*scope A+B*)
    end;

begin
    (*scope A*)
end.
The variable I is visible at all points, because it is never hidden by another variable of the same name. The char variable K is visible only in the main program, because it is hidden by the real variable K, which is visible in procedures B and C only. Variable L is also visible only in procedures B and C, but it does not hide any other variable. Variable M is visible only in procedure C and is therefore not accessible either from procedure B or from the main program. Also, procedure C is visible only in procedure B and can therefore not be called from the main program.
There could have been another procedure C declared in the program outside of procedure B. The place in the program where "C" is mentioned then determines which of the two procedures named C it represents; this is precisely analogous to the scoping of variables.
Correct implementation of static scope in languages with first-class nested functions is not trivial, as it requires each function value to carry with it a record of the values of the variables that it depends on (the pair of the function and this environment is called a closure). Depending on implementation and computer architecture, variable lookup may become slightly inefficient when very deeply lexically nested functions are used, although there are well known techniques to mitigate this. Also, for nested functions that only refer to their own arguments and (immediately) local variables, all relative locations can be known at compile time. No overhead at all is therefore incurred when using that type of nested function. The same applies to particular parts of a program where nested functions are not used, and, naturally, to programs written in a language where nested functions are not available (such as in the C language).
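As a concrete illustration in C# (already used above), a lambda that captures a local variable produces exactly such a function value; the names MakeCounter and count in the following sketch are chosen only for this example:

using System;

class ClosureDemo
{
    // MakeCounter returns a function value together with its environment:
    // the captured local variable "count". The pair is a closure.
    static Func<int> MakeCounter()
    {
        int count = 0;           // local variable captured by the lambda below
        return () => ++count;    // still refers to "count" after MakeCounter has returned
    }

    static void Main()
    {
        var counter = MakeCounter();
        Console.WriteLine(counter()); // 1
        Console.WriteLine(counter()); // 2: "count" lives on inside the closure
    }
}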
With dynamic scope, each identifier has a global stack of bindings. Introducing a local variable with name x pushes a binding onto the global x stack (which may have been empty), and that binding is popped off when the control flow leaves the scope. Evaluating x in any context always yields the top binding. In other words, a global identifier refers to the identifier associated with the most recent environment. Note that this cannot be done at compile time, because the binding stack only exists at run time, which is why this type of scoping is called dynamic scoping.
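This push/pop discipline can be modeled with an explicit table of per-identifier binding stacks. The following C# sketch is illustrative only; the class DynamicEnv and its Push, Pop, and Lookup methods are invented for this example:

using System;
using System.Collections.Generic;

// A minimal model of dynamic scoping: each name owns a stack of bindings.
// Entering a scope pushes a binding, leaving it pops, and a lookup always
// reads the top of the stack.
static class DynamicEnv
{
    static readonly Dictionary<string, Stack<object>> Bindings =
        new Dictionary<string, Stack<object>>();

    public static void Push(string name, object value)
    {
        if (!Bindings.TryGetValue(name, out var stack))
            Bindings[name] = stack = new Stack<object>();
        stack.Push(value);
    }

    public static void Pop(string name) => Bindings[name].Pop();

    public static object Lookup(string name) => Bindings[name].Peek();
}

class Demo
{
    static void Main()
    {
        DynamicEnv.Push("x", 0);                   // global binding of x
        DynamicEnv.Push("x", 1);                   // local binding in some callee
        Console.WriteLine(DynamicEnv.Lookup("x")); // 1: the most recent binding wins
        DynamicEnv.Pop("x");                       // control flow leaves the scope
        Console.WriteLine(DynamicEnv.Lookup("x")); // 0 again
    }
}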
Generally, certain blocks are defined to create bindings whose lifetime is the execution time of the block; this adds some features of static scoping to the dynamic scoping process. However, since a section of code can be called from many different locations and situations, it can be difficult to determine at the outset what bindings will apply when a variable is used (or whether one exists at all). This can be beneficial; application of the principle of least knowledge suggests that code avoid depending on the reasons for (or circumstances of) a variable's value, but simply use the value according to the variable's definition. This narrow interpretation of shared data can provide a very flexible system for adapting the behavior of a function to the current state (or policy) of the system. However, this benefit relies on careful documentation of all variables used this way as well as on careful avoidance of assumptions about a variable's behavior, and it does not provide any mechanism to detect interference between different parts of a program. Dynamic scoping also voids all the benefits of referential transparency. As such, dynamic scoping can be dangerous, and few modern languages use it. Some languages, like Perl and Common Lisp, allow the programmer to choose static or dynamic scoping when defining or redefining a variable. Logo and Emacs Lisp are examples of languages that use dynamic scoping.
Dynamic scoping is fairly easy to implement. To find an identifier's value, the program could traverse the runtime stack, checking each activation record (each function's stack frame) for a value for the identifier. In practice, this is made more efficient via the use of an association list, which is a stack of name/value pairs. Pairs are pushed onto this stack whenever declarations are made, and popped whenever variables go out of scope.[1] An alternate strategy that is considerably faster is to make use of a central reference table, which associates each name with its current meaning. This avoids a linear search during run-time to find a particular name, but maintaining this table is more complex.[2] Note that both of these strategies assume a last-in-first-out (LIFO) ordering to bindings for any one variable; in practice all bindings are so ordered.
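To make the contrast between the two strategies concrete, the following C# sketch shows the association-list variant: a single stack of name/value pairs searched linearly from the most recent entry (the DynamicEnv sketch above corresponds roughly to a central reference table, with one slot per name). The class AssocList and its methods are again invented for illustration:

using System;
using System.Collections.Generic;

// Association-list strategy: one global stack of (name, value) pairs.
// Lookup scans from the most recently pushed pair downward, so it is a
// linear search at run time.
static class AssocList
{
    static readonly List<(string Name, object Value)> Pairs =
        new List<(string Name, object Value)>();

    public static void Bind(string name, object value) =>
        Pairs.Add((name, value));            // push when a declaration is made

    public static void Unbind() =>
        Pairs.RemoveAt(Pairs.Count - 1);     // pop when the variable goes out of scope

    public static object Lookup(string name)
    {
        for (int i = Pairs.Count - 1; i >= 0; i--)  // newest binding first
            if (Pairs[i].Name == name)
                return Pairs[i].Value;
        throw new KeyNotFoundException(name);
    }
}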
An even simpler implementation is the representation of dynamic variables with simple global variables. The local binding is performed by saving the original value in an anonymous location on the stack that is invisible to the program. When that binding scope terminates, the original value is restored from this location. In fact, dynamic scope originated in this manner. Early implementations of Lisp used this obvious strategy for implementing local variables, and the practice survives in some dialects which are still in use, such as GNU Emacs Lisp. Lexical scope was introduced into Lisp later. This is equivalent to the above central reference table algorithm, except that the central reference table is simply the global variable binding environment, in which the current meaning of the variable is its global value. Maintaining global variables isn't complex. For instance, a symbol object can have a dedicated slot for its global value.
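A sketch of this save-and-restore scheme, again in C# with names invented for illustration, is shown below. Here the dynamic variable is simply a static field; a local binding saves the old value and restores it when the binding scope terminates:

using System;

class ShallowBinding
{
    static int x = 0;              // the global value slot of the dynamic variable x

    static int F() => x;           // reads whatever x currently holds

    static int G()
    {
        int saved = x;             // hidden save of the original value
        x = 1;                     // install the local binding
        try { return F(); }        // any callee now sees x == 1
        finally { x = saved; }     // restore when the binding scope terminates
    }

    static void Main()
    {
        Console.WriteLine(G());    // 1
        Console.WriteLine(x);      // 0: the original value is back
    }
}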
Dynamic scoping provides an excellent abstraction for thread local storage, but if it is used that way it cannot be based on saving and restoring a global variable. A possible implementation strategy is for each variable to have a thread-local key. When the variable is accessed, the thread-local key is used to access the thread-local memory location (by code generated by the compiler, which knows which variables are dynamic and which are lexical). If the thread-local key does not exist for the calling thread, then the global location is used. When a variable is locally bound, the prior value is stored in a hidden location on the stack. The thread-local storage is created under the variable's key, and the new value is stored there. Further nested overrides of the variable within that thread simply save and restore this thread-local location. When the initial, outer-most override's scope terminates, the thread-local key is deleted, exposing the global version of the variable once again to that thread.
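One way to sketch this strategy in C# is with System.Threading.ThreadLocal. The wrapper class DynVar below is invented for illustration and keeps a per-thread stack of overrides rather than literally saving into hidden stack locations, but reads fall back to the global value exactly as described:

using System;
using System.Collections.Generic;
using System.Threading;

// Each dynamic variable owns a global value plus a thread-local stack of
// overrides. A read returns the current thread's innermost override if one
// exists, otherwise the global value.
class DynVar<T>
{
    private T globalValue;
    private readonly ThreadLocal<Stack<T>> overrides =
        new ThreadLocal<Stack<T>>(() => new Stack<T>());

    public DynVar(T initial) { globalValue = initial; }

    public T Value =>
        overrides.Value.Count > 0 ? overrides.Value.Peek() : globalValue;

    public void Bind(T value) => overrides.Value.Push(value);   // thread-local binding
    public void Unbind() => overrides.Value.Pop();              // restore previous binding
}

A caller would Bind a value on entering a scope and Unbind on leaving it, typically in a try/finally block, so other threads continue to observe the global value throughout.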
This example compares the consequences of using static scope and dynamic scope. Observe the following code, in a C-like language:
int x = 0;

int f()
{
    return x;
}

int g()
{
    int x = 1;
    return f();
}
With static (lexical) scoping, calling g will return 0, since it is determined at compile time that the expression x in any invocation of f will yield the global x binding, which is unaffected by the introduction of a local variable of the same name in g.
With dynamic scoping, the binding stack for the x identifier will contain two items when f is invoked from g: the global binding to 0, and the binding to 1 introduced in g (which is still present on the stack since the control flow hasn't left g yet). Since evaluating the identifier expression by definition always yields the top binding, the result is 1.
In the language Perl, variables can be defined with either static or dynamic scoping. Perl's keyword "my" defines a statically scoped local variable, while the keyword "local" defines a dynamically scoped local variable.[3] This makes Perl convenient for illustrating each scoping model with practical examples.
$x = 0;

sub f {
    return $x;
}

sub g {
    my $x = 1;
    return f();
}

print g() . "\n";
The example above uses "my" for static scoping of g's local variable $x. As above, calling g returns 0 because f cannot see g's variable $x, so it looks for the global $x.
$x = 0;

sub f {
    return $x;
}

sub g {
    local $x = 1;
    return f();
}

print g() . "\n";
In this alternative, "local" is used to make g's $x dynamically scoped. Now, calling g yields 1 because f sees g's local variable by looking down the execution stack.
In other words, the dynamically scoped variable $x is resolved in the environment of execution rather than in the environment of definition.